Initially, I sought to replicate the analysis to check my understanding of the data and approach.

1. Day-level analysis

The first test looks at the number of visits among Black and White patients by day for the three months leading up to and the three months following August 12, 2017 (on pages 125-127). Calculating these and estimating the model \(Y = \beta_0 + \beta_1 Period + \beta_2 Race + \beta_3 Period \times Race\) generates the following:

term                  estimate   std.error   statistic     p.value
(Intercept)          97.800000    1.128049   86.698341   0.0000000
postpost              6.906522    1.586609    4.353009   0.0000175
raceBlack           -61.911111    1.595303  -38.808382   0.0000000
postpost:raceBlack   -6.599759    2.243804   -2.941326   0.0034795

The linear model version nearly exactly replicates the results provided in the chapter.

Shown as an ANOVA

post   race        mean          sd
pre    White   97.80000   12.445115
pre    Black   35.88889    6.672844
post   White  104.70652   14.408131
post   Black   36.19565    7.102451

term         df         sumsq        meansq    statistic     p.value
post          1     1183.5732     1183.5732     10.33467   0.0014236
race          1   387405.5632   387405.5632   3382.72898   0.0000000
post:race     1      990.7978      990.7978      8.65140   0.0034795
Residuals   360    41228.8432      114.5246           NA          NA

Again, this nearly exactly replicates the results (the F value for race is very slightly different).
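
As a quick arithmetic check on how the regression and ANOVA presentations relate, the sketch below rebuilds the four cell means from the regression coefficients and squares the interaction t statistic to recover the interaction F value. It is written in Python purely for illustration; the function and variable names are my own, and it uses only the numbers reported above.

```python
# Coefficients from the linear model table above.
# Baseline cell: White patients, pre-period.
coef = {
    "intercept": 97.800000,
    "post": 6.906522,
    "black": -61.911111,
    "post_x_black": -6.599759,
}


def cell_mean(post: bool, black: bool) -> float:
    """Predicted mean daily visits for one period-by-race cell."""
    mean = coef["intercept"]
    if post:
        mean += coef["post"]
    if black:
        mean += coef["black"]
    if post and black:
        mean += coef["post_x_black"]
    return mean


cell_means = {
    ("pre", "White"): cell_mean(post=False, black=False),
    ("pre", "Black"): cell_mean(post=False, black=True),
    ("post", "White"): cell_mean(post=True, black=False),
    ("post", "Black"): cell_mean(post=True, black=True),
}

# The interaction has one degree of freedom and enters last, so its
# F statistic is the square of the interaction t statistic.
interaction_f = (-2.941326) ** 2

for (period, race), mean in cell_means.items():
    print(f"{period:>4} {race:<5} {mean:.5f}")
print(f"interaction F = {interaction_f:.5f}")
```

Because the model is saturated (two periods by two races, four coefficients), the predicted values coincide with the observed cell means in the ANOVA summary.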

Compare to 2016/2018

I ran the same model on the 2016 and 2018 data to understand whether the same patterns appeared in the adjacent years.

 2017  2016  2018
(Intercept) 97.800*** 96.233*** 94.722***
(1.128) (0.971) (0.817)
postpost 6.907*** 4.147** 1.615
(1.587) (1.365) (1.149)
raceBlack −61.911*** −55.478*** −60.078***
(1.595) (1.373) (1.156)
postpost × raceBlack −6.600** −5.174** −1.574
(2.244) (1.931) (1.625)
Num.Obs. 364 364 364
R2 0.904 0.910 0.940
RMSE 10.64 9.16 7.71
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001

The results in 2016 look very similar to those in 2017 – a post-8/12 increase among White patients that is not present among Black patients. The 2018 observations show no pattern.
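
One rough way to put a number on "very similar" is to compare the interaction estimates across years directly. The sketch below (Python, illustration only) computes a simple z statistic for the difference between two years' interaction coefficients. It assumes the yearly estimates are independent and approximately normal, which is my assumption rather than anything established in the chapter; the helper name `diff_z` is hypothetical.

```python
import math

# Interaction (post x Black) estimates and standard errors from the table above.
interaction = {
    2016: (-5.174, 1.931),
    2017: (-6.600, 2.244),
    2018: (-1.574, 1.625),
}


def diff_z(year_a: int, year_b: int) -> float:
    """z statistic for (estimate_a - estimate_b), assuming independent estimates."""
    est_a, se_a = interaction[year_a]
    est_b, se_b = interaction[year_b]
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


z_2017_vs_2016 = diff_z(2017, 2016)
z_2017_vs_2018 = diff_z(2017, 2018)

print(f"2017 vs 2016: z = {z_2017_vs_2016:.2f}")
print(f"2017 vs 2018: z = {z_2017_vs_2018:.2f}")
```

Under these assumptions the 2017 and 2016 interactions differ by about half a standard error (z of roughly -0.48), while the 2017 versus 2018 comparison gives z of roughly -1.81. A single pooled model with year interactions would be the more careful version of this comparison.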

Shift to count model

Next, I estimated the same model using a count regression (negative binomial), which is appropriate for count outcomes rather than continuous outcomes.

 2017  2016  2018
(Intercept) 4.583*** 4.567*** 4.551***
(0.014) (0.012) (0.011)
postpost 0.068*** 0.042* 0.017
(0.020) (0.017) (0.015)
raceBlack −1.002*** −0.859*** −1.006***
(0.024) (0.021) (0.021)
postpost × raceBlack −0.060+ −0.068* −0.016
(0.034) (0.030) (0.029)
Num.Obs. 364 364 364
RMSE 10.64 9.16 7.71
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001

This model generates the same predictions (the mean count for Black and White patients in the pre and post periods), but it more appropriately accounts for the fact that the observation cannot be less than 0, and it uses a negative binomial distribution for the error model rather than a normal distribution. When the mean is large, the negative binomial looks like a normal; when the mean is small, it looks like a right-skewed distribution, that is, with a peak closer to zero and a longer tail to the right.
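
To make the "same predictions" point concrete, the sketch below (Python, illustration only) exponentiates sums of the 2017 negative binomial coefficients, which are on the log scale, and compares them with the observed 2017 cell means from the day-level analysis. The names are my own, and the small discrepancies come from the coefficients being rounded to three decimals in the table.

```python
import math

# 2017 negative binomial coefficients (log scale) from the table above.
nb_coef = {
    "intercept": 4.583,
    "post": 0.068,
    "black": -1.002,
    "post_x_black": -0.060,
}

# Observed 2017 cell means reported in the day-level analysis.
observed_mean = {
    ("pre", "White"): 97.80000,
    ("pre", "Black"): 35.88889,
    ("post", "White"): 104.70652,
    ("post", "Black"): 36.19565,
}


def predicted_count(post: bool, black: bool) -> float:
    """Predicted mean daily count: exponentiate the linear predictor."""
    log_mean = nb_coef["intercept"]
    if post:
        log_mean += nb_coef["post"]
    if black:
        log_mean += nb_coef["black"]
    if post and black:
        log_mean += nb_coef["post_x_black"]
    return math.exp(log_mean)


predicted = {
    ("pre", "White"): predicted_count(False, False),
    ("pre", "Black"): predicted_count(False, True),
    ("post", "White"): predicted_count(True, False),
    ("post", "Black"): predicted_count(True, True),
}

for cell, value in predicted.items():
    print(cell, f"predicted={value:.2f}", f"observed={observed_mean[cell]:.2f}")

# Coefficients are multiplicative on the count scale: exp(0.068) is the
# post/pre ratio of mean daily visits among White patients.
post_ratio_white = math.exp(nb_coef["post"])
```

Here `post_ratio_white` is about 1.07, that is, roughly a 7% increase in mean daily visits among White patients after 8/12.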

Next steps on day-level analysis

Given the above visuals and the upcoming analyses on individual-level data, the day-level analyses above seem somewhat out of step with the expected outcomes. The argument isn’t that the impact will persist for three months but that it will be more immediate: while the true health impacts may last longer, the observable impact on hospital visits (for these diagnoses) is not expected to remain high for the full three-month period.

To accommodate this, I repeated the above models with a three-epoch variable – pre-8/12, the immediate post-8/12 period, and the later post-8/12 period. First, I wanted to understand if the first several weeks post-8/12 were notably different.

 (1)
(Intercept) 24.767***
(0.496)
postweekweek1 2.519
(1.847)
postweekweek2 9.233***
(1.847)
postweekweek3 −0.481
(1.847)
postweekweek4 2.805
(1.847)
postweekweek5 6.948***
(1.847)
postweekweek6 6.948***
(1.847)
postweekpost 0.633
(0.830)
raceBlack −15.089***
(0.702)
postweekweek1 × raceBlack 0.089
(2.612)
postweekweek2 × raceBlack −10.197***
(2.612)
postweekweek3 × raceBlack 0.089
(2.612)
postweekweek4 × raceBlack −4.197
(2.612)
postweekweek5 × raceBlack −7.625**
(2.612)
postweekweek6 × raceBlack −6.054*
(2.612)
postweekpost × raceBlack −1.791
(1.174)
Num.Obs. 364
R2 0.774
RMSE 4.60
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001

Almost all of the dynamics are among White patients; the interaction effects for race and period show essentially offsetting effects for Black patients. That is, while the average daily visits for a relevant diagnosis among White patients vary across these periods (going up in week 2, falling back to the pre-period mean in weeks 3 and 4, and rising again in weeks 5 and 6), the interaction effects for Black patients are significant only when the period effects are significant for White patients, and in the opposite direction.
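
The offsetting pattern is easy to see by adding each period coefficient to its interaction with race, which gives the implied change from the pre-period mean for Black patients. The sketch below (Python, illustration only, with names of my own choosing) does that arithmetic from the table above; it does not test whether the resulting net changes differ from zero.

```python
# Period effects (change from the pre-period mean among White patients).
period_effect = {
    "week1": 2.519, "week2": 9.233, "week3": -0.481, "week4": 2.805,
    "week5": 6.948, "week6": 6.948, "post": 0.633,
}

# Period x Black interaction terms from the same table.
period_x_black = {
    "week1": 0.089, "week2": -10.197, "week3": 0.089, "week4": -4.197,
    "week5": -7.625, "week6": -6.054, "post": -1.791,
}

# Implied change from the pre-period mean among Black patients.
net_black = {
    period: round(period_effect[period] + period_x_black[period], 3)
    for period in period_effect
}

for period, change in net_black.items():
    print(f"{period:<6} White: {period_effect[period]:+.3f}   Black: {change:+.3f}")
```

In weeks 2, 5, and 6, where the White period effects are 7 to 9 visits per day, the implied Black changes are all within about one visit of zero.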

2. Individual level analysis

The second, and central, test looks at the likelihood that patients who select into UVA have a diagnosis in the relevant domains (anxiety/depression, MUPS, alcohol related), comparing patient visits in the two weeks prior to and subsequent to August 12, 2017 (with tests of other windows). The relevant model (pages 126-129) is \(Prob(Y = 1) = \beta_0 + \beta_1 Period + \beta_2 Day + \beta_3 Race + \beta_4 Period \times Day + \beta_5 Period \times Race + \beta_6 Day \times Period \times Race\) with controls for age, sex, and an indicator for whether the day is on a weekend.

Estimating that model (as a linear probability model) for each of the relevant diagnoses generates the tables below. The values are not the same as those presented in the chapter, though the patterns for signs and significance are the same.

MUPS

adm_year race term estimate std.error statistic p.value
2016 Black pre_post_yearly -0.0317338 0.0532612 -0.5958147 0.5514163
2016 Black days -0.0023771 0.0052392 -0.4537081 0.6501246
2016 Black pre_post_yearly:days -0.0004547 0.0067978 -0.0668881 0.9466824
2016 White pre_post_yearly -0.0023544 0.0344537 -0.0683349 0.9455241
2016 White days 0.0024831 0.0033616 0.7386672 0.4601731
2016 White pre_post_yearly:days -0.0015620 0.0043727 -0.3572130 0.7209602
2017 Black pre_post_yearly 0.1397783 0.0592751 2.3581295 0.0185629
2017 Black days 0.0034313 0.0057140 0.6005178 0.5483000
2017 Black pre_post_yearly:days -0.0185293 0.0072474 -2.5566829 0.0107173
2017 White pre_post_yearly 0.0158091 0.0342184 0.4620066 0.6441120
2017 White days 0.0023871 0.0032168 0.7420691 0.4581069
2017 White pre_post_yearly:days -0.0013568 0.0042011 -0.3229615 0.7467482
2018 Black pre_post_yearly 0.0586379 0.0579729 1.0114701 0.3120636
2018 Black days 0.0042350 0.0055540 0.7625095 0.4459560
2018 Black pre_post_yearly:days -0.0085772 0.0072946 -1.1758340 0.2399725
2018 White pre_post_yearly -0.0312464 0.0340153 -0.9185979 0.3583911
2018 White days -0.0006505 0.0032947 -0.1974312 0.8435055
2018 White pre_post_yearly:days 0.0001318 0.0042689 0.0308748 0.9753718
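
To read the 2017 Black coefficients together, the sketch below (Python, illustration only) computes the implied post-period shift in the probability of a MUPS diagnosis as a function of `days`, and the value of `days` at which that shift reaches zero. It assumes `days` is measured so that the post-period shift at day d equals the `pre_post_yearly` coefficient plus d times the `pre_post_yearly:days` coefficient; I have not confirmed how `days` is coded, so treat the day counts as conditional on that assumption. The function names are hypothetical.

```python
# 2017, Black patients, MUPS (from the table above).
b_post = 0.1397783        # pre_post_yearly
b_post_days = -0.0185293  # pre_post_yearly:days


def post_shift(day: float) -> float:
    """Implied post-period shift in Prob(Y = 1) at a given value of `days`."""
    return b_post + b_post_days * day


def days_until_offset() -> float:
    """Value of `days` at which the post-period shift returns to zero."""
    return -b_post / b_post_days


offset_day = days_until_offset()
print(f"shift at day 0: {post_shift(0):+.4f}")
print(f"shift at day 7: {post_shift(7):+.4f}")
print(f"shift reaches zero at days = {offset_day:.2f}")
```

Under that coding assumption, the initial increase of about 14 percentage points is offset after roughly 7.5 days, which fits the expectation that the observable impact is immediate rather than sustained.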

Alcohol

adm_year race term estimate std.error statistic p.value
2016 Black pre_post_yearly -0.0145050 0.0124053 -1.1692575 0.2425422
2016 Black days -0.0013208 0.0012203 -1.0823600 0.2793195
2016 Black pre_post_yearly:days 0.0023184 0.0015833 1.4642932 0.1433871
2016 White pre_post_yearly -0.0034457 0.0105630 -0.3262039 0.7442953
2016 White days -0.0005127 0.0010306 -0.4974795 0.6188913
2016 White pre_post_yearly:days 0.0010789 0.0013406 0.8048058 0.4210024
2017 Black pre_post_yearly 0.0300740 0.0120741 2.4907854 0.0129107
2017 Black days 0.0018423 0.0011639 1.5828868 0.1137699
2017 Black pre_post_yearly:days -0.0038160 0.0014763 -2.5848961 0.0098844
2017 White pre_post_yearly -0.0090604 0.0096768 -0.9363040 0.3491964
2017 White days -0.0008338 0.0009097 -0.9165527 0.3594550
2017 White pre_post_yearly:days 0.0011209 0.0011880 0.9434967 0.3455072
2018 Black pre_post_yearly -0.0132176 0.0161378 -0.8190460 0.4129773
2018 Black days -0.0007135 0.0015460 -0.4615095 0.6445448
2018 Black pre_post_yearly:days 0.0018873 0.0020306 0.9294571 0.3529017
2018 White pre_post_yearly -0.0097727 0.0103721 -0.9422185 0.3461683
2018 White days -0.0004116 0.0010046 -0.4096542 0.6820934
2018 White pre_post_yearly:days 0.0015249 0.0013017 1.1714860 0.2415108

Anxiety/depression

adm_year race term estimate std.error statistic p.value
2016 Black pre_post_yearly -0.0048338 0.0124964 -0.3868131 0.6989661
2016 Black days 0.0002292 0.0012292 0.1864620 0.8521154
2016 Black pre_post_yearly:days 0.0005381 0.0015949 0.3374001 0.7358769
2016 White pre_post_yearly -0.0045541 0.0076859 -0.5925232 0.5535498
2016 White days -0.0009334 0.0007499 -1.2447158 0.2133439
2016 White pre_post_yearly:days 0.0010080 0.0009755 1.0333437 0.3015353
2017 Black pre_post_yearly -0.0083553 0.0128821 -0.6485943 0.5167526
2017 Black days -0.0000508 0.0012418 -0.0409396 0.9673524
2017 Black pre_post_yearly:days 0.0010843 0.0015751 0.6883883 0.4913712
2017 White pre_post_yearly -0.0225232 0.0091382 -2.4647342 0.0137703
2017 White days -0.0016878 0.0008591 -1.9646588 0.0495512
2017 White pre_post_yearly:days 0.0027880 0.0011219 2.4850343 0.0130111
2018 Black pre_post_yearly 0.0271333 0.0134378 2.0191826 0.0437646
2018 Black days 0.0013499 0.0012874 1.0486004 0.2946439
2018 Black pre_post_yearly:days -0.0038752 0.0016908 -2.2918717 0.0221434
2018 White pre_post_yearly -0.0007134 0.0074659 -0.0955536 0.9238825
2018 White days -0.0003355 0.0007232 -0.4639248 0.6427404
2018 White pre_post_yearly:days 0.0009412 0.0009370 1.0044862 0.3152377

I’m not sure what’s causing the differences. I tried changing the windows of days by one or two on either side, but I get closest to the days listed in the table by choosing

Next steps on individual analysis

  1. Switch to logit model
  2. 2017 model with local/not local – set this up as a single model to leverage both distance in time and space as quasi-experimental elements.
  • Question on what constitutes local: currently the location variable includes only 22901, 22902, 22903, 22905, 22906, 22908, 22910; these are primarily Charlottesville zip codes. Should Albemarle, or those in the urban periphery of Charlottesville, also be included?
  3. Model with all years and interactions (consider for local only?) – formalize the across-year difference as a test of.
  4. Visualize relevant effects
  5. Generate robustness table of other windows for appendix
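
For the local/not local question, a small sketch of how the indicator could be built (Python, illustration only). The ZIP codes are the seven listed above; the helper `is_local` and the optional `extra_zips` argument are hypothetical, included only to show how widening the definition (for example to Albemarle ZIP codes) would be a one-argument change.

```python
# ZIP codes currently treated as local (from the list above).
LOCAL_ZIPS = frozenset(
    {"22901", "22902", "22903", "22905", "22906", "22908", "22910"}
)


def is_local(zip_code: str, extra_zips: frozenset = frozenset()) -> bool:
    """Return True if a patient ZIP falls in the local set.

    Only the first five characters are used, so ZIP+4 values such as
    "22903-1234" are treated the same as "22903".
    """
    five_digit = zip_code.strip()[:5]
    return five_digit in LOCAL_ZIPS or five_digit in extra_zips


print(is_local("22903"))        # in the current list
print(is_local("22903-1234"))   # ZIP+4 form of a listed code
print(is_local("22911"))        # not in the current list
```

Keeping the definition in one place makes it straightforward to rerun the 2017 model under a narrower and a wider definition of local.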

Remaining question

A check on diagnoses/ICD-10 codes. There are more codes included within each diagnosis group than indicated in the appendix.

  • For anxiety/depression
## # A tibble: 21 × 1
##    icd_10_code
##    <chr>      
##  1 F32.0      
##  2 F32.1      
##  3 F32.2      
##  4 F32.3      
##  5 F32.9      
##  6 F33.0      
##  7 F33.2      
##  8 F33.3      
##  9 F41.0      
## 10 F41.8      
## 11 F41.9      
## 12 F43.0      
## 13 F43.21     
## 14 F43.23     
## 15 F43.25     
## 16 R07.89     
## 17 R10.9      
## 18 R11.2      
## 19 R45.851    
## 20 S69.81XA   
## 21 Z41.8

Beyond the codes listed in the appendix, the data also include R07.89, R10.9, R11.2, R45.851, S69.81XA, and Z41.8.

  • For MUPS
## # A tibble: 11 × 1
##    icd_10_code
##    <chr>      
##  1 F44.5      
##  2 G40.89     
##  3 M54.5      
##  4 M54.9      
##  5 R06.00     
##  6 R07.9      
##  7 R10.9      
##  8 R42        
##  9 R51        
## 10 R53.83     
## 11 R56.9

Beyond the codes listed in the appendix, the data also include G40.89.
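
The code check itself is a set comparison. The sketch below (Python, illustration only) uses the two code lists printed above. The appendix list for MUPS is not reproduced in these notes, so `appendix_mups_assumed` is reconstructed as the data codes minus the one flagged above; it is a placeholder to be replaced with the actual appendix list. The sketch also checks whether any code is assigned to both groups in the data.

```python
# Codes observed in the data (from the tibbles above).
data_anxdep = {
    "F32.0", "F32.1", "F32.2", "F32.3", "F32.9", "F33.0", "F33.2", "F33.3",
    "F41.0", "F41.8", "F41.9", "F43.0", "F43.21", "F43.23", "F43.25",
    "R07.89", "R10.9", "R11.2", "R45.851", "S69.81XA", "Z41.8",
}
data_mups = {
    "F44.5", "G40.89", "M54.5", "M54.9", "R06.00", "R07.9", "R10.9",
    "R42", "R51", "R53.83", "R56.9",
}

# ASSUMPTION: placeholder for the appendix list, reconstructed from the
# note above rather than copied from the appendix itself.
appendix_mups_assumed = data_mups - {"G40.89"}


def extra_codes(data_codes: set, appendix_codes: set) -> set:
    """Codes present in the data but absent from the appendix list."""
    return data_codes - appendix_codes


extra_mups = extra_codes(data_mups, appendix_mups_assumed)

# Codes that appear under both diagnosis groups in the data.
shared_codes = data_anxdep & data_mups

print("MUPS codes not in appendix:", sorted(extra_mups))
print("Codes in both anxiety/depression and MUPS:", sorted(shared_codes))
```

In the lists above, R10.9 appears under both anxiety/depression and MUPS, which may be worth confirming alongside the appendix comparison.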

  • For Alcohol
## # A tibble: 27 × 1
##    icd_10_code
##    <chr>      
##  1 F10.10     
##  2 F10.120    
##  3 F10.121    
##  4 F10.129    
##  5 F10.14     
##  6 F10.159    
##  7 F10.180    
##  8 F10.188    
##  9 F10.20     
## 10 F10.220    
## 11 F10.221    
## 12 F10.229    
## 13 F10.230    
## 14 F10.231    
## 15 F10.232    
## 16 F10.239    
## 17 F10.24     
## 18 F10.251    
## 19 F10.259    
## 20 F10.288    
## 21 F10.920    
## 22 F10.929    
## 23 F10.94     
## 24 F10.951    
## 25 F10.980    
## 26 F10.982    
## 27 F10.99

Beyond the codes listed in the appendix, the data also include F10.159 and F10.982.